LIG at MediaEval 2015 Multimodal Person Discovery in Broadcast TV Task
نویسندگان
چکیده
In this working notes paper the contribution of the LIG team (partnership between Univ. Grenoble Alpes and Ozyegin University) to the Multimodal Person Discovery in Broadcast TV task in MediaEval 2015 is presented. The task focused on unsupervised learning techniques. Two different approaches were submitted by the team. In the first one, new features for face and speech modalities were tested. In the second one, an alternative way to calculate the distance between face tracks and speech segments is presented. It also had a competitive MAP score and was able to beat the baseline.
منابع مشابه
PERCOLATTE : A Multimodal Person Discovery System in TV Broadcast for the Medieval 2015 Evaluation Campaign
This paper describes the PERCOLATTE participation to MediaEval 2015 task: “Multimodal Person Discovery in Broadcast TV” which requires developing algorithms for unsupervised talking face identification in broadcast news. The proposed approach relies on two identity propagation strategies both based on document chaptering and restricted overlaid names propagation rules. The primary submission sh...
متن کاملMultimodal Person Discovery in Broadcast TV at MediaEval 2015
We describe the“Multimodal Person Discovery in Broadcast TV” task of MediaEval 2015 benchmarking initiative. Participants were asked to return the names of people who can be both seen as well as heard in every shot of a collection of videos. The list of people was not known a priori and their names had to be discovered in an unsupervised way from media content using text overlay or speech trans...
متن کاملTokyo Tech at MediaEval 2016 Multimodal Person Discovery in Broadcast TV task
This paper describes our diarization system for the Multimodal Person Discovery in Broadcast TV task of the MediaEval 2016 Benchmark evaluation campaign [1]. The goal of this task is naming speakers, who are appearing and speaking simultaneously in the video, without prior knowledge. Our diarization system relies on face diarization approach. We extract deep features from a face every 0.5 secon...
متن کاملCombining Audio Features and Visual I-Vector @ MediaEval 2015 Multimodal Person Discovery in Broadcast TV
This paper describes our diarization system for the Multimodal Person Discovery in Broadcast TV task of the MediaEval 2015 Benchmark evaluation campaign [1]. The goal of this task is naming speakers, who are appearing and speaking simultaneously in the video, without prior knowledge. Our diarization system is based on multimodal approach to combine audio and visual informations. We extract feat...
متن کاملGTM-UVigo System for Multimodal Person Discovery in Broadcast TV Task at MediaEval 2016
In this paper, we present the system developed by GTMUVigo team for the Multimedia Person Discovery in Broadcast TV task at MediaEval 2016. The proposed approach consists in a novel strategy for person discovery which is not based on speaker and face diarisation as in previous works. In this system, the task is approached as a person recognition problem: there is an enrolment stage, where the v...
متن کامل